A Data Structure and Integer Programming Technique to Facilitate Cell Suppression Strategies

Authors

  • Colleen M. Sullivan
  • Errol G. Rowe
Abstract

The U.S. Bureau of the Census is responsible for collecting data on economic sectors and for publishing these data without violating confidentiality laws. Collected data contain sensitive values that, if published directly, could reveal an individual establishment's data. A number of methods are available to prevent compromising the sensitive cells. These disclosure avoidance techniques include rounding, perturbation, and cell suppression, and are outlined in Cox, et al. (1986a). The Bureau's current practice is to protect any cell where n or fewer respondents make up k percent or more of a table cell's value (Zayatz, 1992). (The values of n and k are confidential.) Since rounding and perturbation are unsatisfactory for economic aggregate magnitude data (Cox, et al. 1986b), the Economic Divisions have always chosen a cell suppression technique to protect published tabular data. Instead of the sensitive data value appearing in the publication, a "D" appears in its place. However, in most cases the sensitive data values could still be derived from non-sensitive data, because most data items are published in additive tables. Therefore, additional data values must be suppressed. These additional suppressed data values are commonly referred to as complementary suppressions. The Census Bureau's objective in applying complementary suppressions is to minimize the sum of the data values chosen as complementary suppressions. Minimizing the cost incurred through complementary suppressions produces a publishable table with maximum data utility; that is, the greatest amount of usable data is provided. Furthermore, the Bureau uses complementary suppressions to ensure that a data user cannot estimate the value of a sensitive data cell within a predefined interval.
That is, when choosing complementary suppressions for some primary suppression with true value X, we ensure that it cannot be estimated within a smaller interval than [X - L, X + U], where L is the amount of lower protection required by X and U is the amount of upper protection required by X. Kelly, et al. (1991) discuss protection levels in greater detail. In recent years, the Economic Divisions of the Bureau have employed a cell suppression technique that utilizes network flow methodology. The use of graph theory in the disclosure avoidance area originates in Cox (1980) and Gusfield (1984). More recently, Cox, et al. (1986a) have outlined this methodology, and a more complete history is given in Greenberg (1990). A general outline of the minimum cost network flow problem and related methodology appears in Bazaraa & Jarvis (1977) and Gondran & Minoux (1984). The network flow system currently employed is implemented using the commercially available Minimum Cost Flow (MCF) program of Glover, Klingman and Mote. As described in its documentation, "MCF is a highly refined implementation of the upper bounded, revised primal simplex algorithm for linear programming." With this refined implementation, the primal simplex method can be performed directly on a network. Kennington and Helgason (1980) refer to this procedure as the "simplex on a graph" algorithm. Although MCF is computationally fast, it often oversuppresses due to the structure of the objective function (see Section 3). The ideal technique for choosing complementary suppressions is the integer programming routine outlined in Section 2. This routine, however, is computationally impractical for census tables. This paper discusses a hybrid technique, outlined in Section 4, that uses both MCF and integer programming and lessens the oversuppression problem without adding substantial computation time (see Section 5).
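
The two ideas that motivate complementary suppression can be illustrated with a minimal sketch. It is not the Bureau's implementation: the function names `is_sensitive` and `recover_single_suppression` are hypothetical, and the values of n and k used below are made up for illustration, since the actual values are confidential. The first function applies the "n or fewer respondents make up k percent or more" rule to one cell; the second shows how a lone suppressed cell in an additive row can be recovered from the published total, which is why additional cells must be suppressed.

```python
from typing import Optional, Sequence


def is_sensitive(contributions: Sequence[float], n: int, k: float) -> bool:
    """Return True if the n largest contributions make up k percent
    or more of the cell total (hypothetical n and k)."""
    total = sum(contributions)
    if total <= 0:
        return False
    top_n = sum(sorted(contributions, reverse=True)[:n])
    return 100.0 * top_n / total >= k


def recover_single_suppression(
    row: Sequence[Optional[float]], row_total: float
) -> Optional[float]:
    """If exactly one cell in an additive row is suppressed (None),
    derive its value from the published row total; otherwise return None."""
    suppressed = [i for i, value in enumerate(row) if value is None]
    if len(suppressed) != 1:
        return None
    return row_total - sum(value for value in row if value is not None)


# Illustrative data only: one respondent dominates the cell.
cell_contributions = [900.0, 60.0, 40.0]
print(is_sensitive(cell_contributions, n=1, k=80.0))

# The sensitive cell is published as "D" (None here), but the row total
# still reveals it unless a complementary suppression is added.
published_row = [120.0, None, 300.0]
print(recover_single_suppression(published_row, row_total=1420.0))
```

With a second cell in the same row suppressed, `recover_single_suppression` returns `None`, because simple subtraction no longer pins down either value. Passing this check is only a necessary condition: the interval requirement [X - L, X + U] described above needs the stronger analysis that the network flow and integer programming methods provide.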

Publication date: 2002